1.
Syst Rev ; 8(1): 278, 2019 11 15.
Article in English | MEDLINE | ID: mdl-31727150

ABSTRACT

BACKGROUND: We explored the performance of three machine learning tools designed to facilitate title and abstract screening in systematic reviews (SRs) when used to (a) eliminate irrelevant records (automated simulation) and (b) complement the work of a single reviewer (semi-automated simulation). We evaluated user experiences for each tool. METHODS: We subjected three SRs to two retrospective screening simulations. In each tool (Abstrackr, DistillerSR, RobotAnalyst), we screened a 200-record training set and downloaded the predicted relevance of the remaining records. We calculated the proportion missed and workload and time savings compared to dual independent screening. To test user experiences, eight research staff tried each tool and completed a survey. RESULTS: Using Abstrackr, DistillerSR, and RobotAnalyst, respectively, the median (range) proportion missed was 5 (0 to 28) percent, 97 (96 to 100) percent, and 70 (23 to 100) percent for the automated simulation and 1 (0 to 2) percent, 2 (0 to 7) percent, and 2 (0 to 4) percent for the semi-automated simulation. The median (range) workload savings was 90 (82 to 93) percent, 99 (98 to 99) percent, and 85 (85 to 88) percent for the automated simulation and 40 (32 to 43) percent, 49 (48 to 49) percent, and 35 (34 to 38) percent for the semi-automated simulation. The median (range) time savings was 154 (91 to 183), 185 (95 to 201), and 157 (86 to 172) hours for the automated simulation and 61 (42 to 82), 92 (46 to 100), and 64 (37 to 71) hours for the semi-automated simulation. Abstrackr identified 33-90% of records missed by a single reviewer. RobotAnalyst performed less well and DistillerSR provided no relative advantage. User experiences depended on user friendliness, qualities of the user interface, features and functions, trustworthiness, ease and speed of obtaining predictions, and practicality of the export file(s). 
CONCLUSIONS: The workload savings afforded in the automated simulation came with increased risk of missing relevant records. Supplementing a single reviewer's decisions with relevance predictions (semi-automated simulation) sometimes reduced the proportion missed, but performance varied by tool and SR. Designing tools based on reviewers' self-identified preferences may improve their compatibility with present workflows. SYSTEMATIC REVIEW REGISTRATION: Not applicable.


Subjects
Information Storage and Retrieval/methods , Machine Learning , Software , Abstracting and Indexing/classification , Humans , Reproducibility of Results , Systematic Reviews as Topic , Time Factors , Workload
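
The two headline measures in the abstract above (proportion missed and workload savings in the automated simulation) can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so the definitions below are assumptions: a record is "missed" when it is truly relevant but predicted irrelevant, and workload is "saved" for every record predicted irrelevant. The function name and the sample data are hypothetical.

```python
def automated_screening_metrics(truly_relevant, predicted_relevant):
    """Return (proportion_missed, workload_savings) for an automated simulation.

    Both arguments are equal-length lists of booleans, one entry per record
    remaining after the training set. The definitions are assumptions, not
    the formulas used by the study authors.
    """
    if len(truly_relevant) != len(predicted_relevant):
        raise ValueError("inputs must have the same length")
    n_relevant = sum(truly_relevant)
    # Relevant records that the tool predicted irrelevant would be lost.
    n_missed = sum(
        1 for t, p in zip(truly_relevant, predicted_relevant) if t and not p
    )
    # Records predicted irrelevant are never screened by a human.
    n_skipped = sum(1 for p in predicted_relevant if not p)
    proportion_missed = n_missed / n_relevant if n_relevant else 0.0
    workload_savings = (
        n_skipped / len(predicted_relevant) if predicted_relevant else 0.0
    )
    return proportion_missed, workload_savings


# Hypothetical toy data: 10 records, 3 of them truly relevant.
truth = [True, True, False, False, False, True, False, False, False, False]
preds = [True, False, False, False, False, True, True, False, False, False]
missed, saved = automated_screening_metrics(truth, preds)
```

On this toy data one of the three relevant records is lost while seven of ten records skip human screening, which mirrors the trade-off the conclusions describe.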
2.
Syst Rev ; 8(1): 277, 2019 11 15.
Article in English | MEDLINE | ID: mdl-31727159

ABSTRACT

BACKGROUND: Web applications that employ natural language processing technologies to support systematic reviewers during abstract screening have become more common. The goal of our project was to conduct a case study to explore a screening approach that temporarily replaces a human screener with a semi-automated screening tool. METHODS: We evaluated the accuracy of the approach using DistillerAI as a semi-automated screening tool. A published comparative effectiveness review served as the reference standard. Five teams of professional systematic reviewers screened the same 2472 abstracts in parallel. Each team trained DistillerAI with 300 randomly selected abstracts that the team screened dually. For all remaining abstracts, DistillerAI replaced one human screener and provided predictions about the relevance of records. A single reviewer also screened all remaining abstracts. A second human screener resolved conflicts between the single reviewer and DistillerAI. We compared the decisions of the machine-assisted approach, single-reviewer screening, and screening with DistillerAI alone against the reference standard. RESULTS: The combined sensitivity of the machine-assisted screening approach across the five screening teams was 78% (95% confidence interval [CI], 66 to 90%), and the combined specificity was 95% (95% CI, 92 to 97%). By comparison, the sensitivity of single-reviewer screening was similar (78%; 95% CI, 66 to 89%); however, the sensitivity of DistillerAI alone was substantially worse (14%; 95% CI, 0 to 31%) than that of the machine-assisted screening approach. Specificities for single-reviewer screening and DistillerAI were 94% (95% CI, 91 to 97%) and 98% (95% CI, 97 to 100%), respectively. Machine-assisted screening and single-reviewer screening had similar areas under the curve (0.87 and 0.86, respectively); by contrast, the area under the curve for DistillerAI alone was just slightly better than chance (0.56). 
The interrater agreement between human screeners and DistillerAI with a prevalence-adjusted kappa was 0.85 (95% CI, 0.84 to 0.86). CONCLUSIONS: The accuracy of DistillerAI is not yet adequate to replace a human screener temporarily during abstract screening for systematic reviews. Rapid reviews, which do not require detecting the totality of the relevant evidence, may find semi-automation tools to have greater utility than traditional systematic reviews.


Subjects
Information Storage and Retrieval/methods , Natural Language Processing , Software , Abstracting and Indexing/classification , Humans , Internet , Reproducibility of Results , Sensitivity and Specificity , Systematic Reviews as Topic
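
The accuracy measures reported in the abstract above follow from a 2x2 table of screening decisions against the reference standard. The sketch below shows the standard definitions of sensitivity and specificity, plus the prevalence-adjusted bias-adjusted kappa (PABAK = 2 x observed agreement - 1). Whether the authors computed their "prevalence-adjusted kappa" exactly this way is an assumption, and the counts used are hypothetical, not the study data.

```python
def sensitivity(true_pos, false_neg):
    """Share of truly relevant records that were correctly included."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of truly irrelevant records that were correctly excluded."""
    return true_neg / (true_neg + false_pos)


def pabak(observed_agreement):
    """Prevalence-adjusted bias-adjusted kappa for two raters, two categories."""
    return 2 * observed_agreement - 1


# Hypothetical counts for illustration only.
sens = sensitivity(true_pos=7, false_neg=2)
spec = specificity(true_neg=95, false_pos=5)
kappa = pabak(observed_agreement=0.925)
```

A high specificity paired with a low sensitivity, as reported for DistillerAI alone, means a tool excludes almost everything, including most of the relevant records.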
3.
Rev. cuba. oftalmol ; 29(3): 516-566, Jul.-Sep. 2016.
Article in Spanish | CUMED | ID: cum-64711

ABSTRACT


Between the appearance of the first Cuban ophthalmology publication in 1919 and 1961, 43 years passed, yet journals were published in only 26 of those years, and the longest continuous run lasted just seven years. This reveals an inconsistency in Cuba's ophthalmological literature that led to the loss of important scientific material, recorded updates, and news about events in the specialty in the country. The objective of this paper was to review the publications in this specialty from its emergence to the present (AU)


Subjects
Humans , Scientific and Technical Publications , Ophthalmology/history , Abstracting and Indexing/classification
4.
Rev. cuba. oftalmol ; 29(3): 516-566, Jul.-Sep. 2016.
Article in Spanish | LILACS | ID: biblio-830486

ABSTRACT


Between the appearance of the first Cuban ophthalmology publication in 1919 and 1961, 43 years passed, yet journals were published in only 26 of those years, and the longest continuous run lasted just seven years. This reveals an inconsistency in Cuba's ophthalmological literature that led to the loss of important scientific material, recorded updates, and news about events in the specialty in the country. The objective of this paper was to review the publications in this specialty from its emergence to the present (AU)


Subjects
Humans , Scientific and Technical Publications , Abstracting and Indexing/classification , Ophthalmology/history
6.
Braz. j. pharm. sci ; 50(3): 529-534, Jul-Sep/2014. graphs
Article in English | LILACS | ID: lil-728701

ABSTRACT

This study aimed to analyze whether ecstasy consumption is associated with socioeconomic status in the Municipality of São Paulo, Brazil, from 2000 to 2007. We used an official, reliable and unbiased source supplied by the Department of Narcotics of the State of São Paulo (Denarc) database and the Human Development Index of the districts (HDId) where the seizures occurred. A Spearman correlation test between the average number of ecstasy seizures per million inhabitants and the HDId was used. There were 190 seizures (totaling 47,934 tablets) spread out in 53 of the 96 districts, and 51.6% were concentrated in only 8 districts. The higher rates of ecstasy seizures were directly associated with districts with high HDId, which confirmed the association of ecstasy consumption with socioeconomic status. Itaim-Bibi, Jardim Paulista and Moema were the top three districts with the highest HDId. In these districts, the number of tablets per seizure ranged from a few units to thousands, revealing that consumption and trafficking coexist in the same place. Districts with many nightclubs can also influence the incidence of seizures. This knowledge can be useful to help the police from other Brazilian cities to combat ecstasy trafficking.



Subjects
N-Methyl-3,4-methylenedioxyamphetamine/analysis , Abstracting and Indexing/classification , Drug-Seeking Behavior , Seizures , Socioeconomic Factors
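
The Spearman correlation test named in the abstract above ranks both variables and then correlates the ranks. The following is a minimal pure-Python sketch of that idea, with tied values given their average rank. The district values are hypothetical placeholders, not the Denarc or HDId data, and the helper names are illustrative.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values for four districts.
hdi = [0.70, 0.80, 0.90, 0.95]
seizure_rate = [1.0, 3.0, 2.0, 8.0]
rho = spearman(hdi, seizure_rate)
```

Because only ranks enter the calculation, a few districts with very large seizure counts do not dominate the coefficient the way they would with a Pearson correlation on raw values.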
8.
PLoS One ; 7(9): e44183, 2012.
Article in English | MEDLINE | ID: mdl-22984474

ABSTRACT

PURPOSE: To reduce publication bias, systematic reviewers are advised to search conference abstracts to identify randomized controlled trials (RCTs) conducted in humans and not published in full. We assessed the information provided by authors to aid identification of RCTs for reviews. METHODS: We handsearched the Association for Research in Vision and Ophthalmology (ARVO) meeting abstracts for 2004 to 2009 to identify reports of RCTs. We compared our classification with that of authors (requested by ARVO 2004-2006), and authors' report of trial registration (required by ARVO 2007-2009). RESULTS: Authors identified their study as a clinical trial for 169/191 (88%; 95% CI, 84-93) RCTs we identified for 2004, 174/212 (82%; 95% CI, 77-87) for 2005 and 162/215 (75%; 95% CI, 70-81) for 2006. Authors provided registration information for 107/172 (62%; 95% CI, 55-69) RCTs for 2007, 103/153 (67%; 95% CI, 60-75) for 2008, and 126/171 (74%; 95% CI, 67-80) for 2009. Most RCT authors providing a trial register name specified ClinicalTrials.gov (276/312; 88%; 95% CI, 85-92) and provided a valid ClinicalTrials.gov registration number (261/276; 95%; 95% CI, 92-97). Based on information provided by authors, trial registration information would be accessible for 48% (83/172) (95% CI, 41-56) of all ARVO abstracts describing RCTs in 2007, 63% (96/153) (95% CI, 55-70) in 2008, and 70% in 2009 (118/171) (95% CI, 62-76). CONCLUSIONS: Authors of abstracts describing RCTs frequently did not classify them as clinical trials nor comply with reporting trial registration information, as required by the conference organizers. Systematic reviewers cannot rely on authors to identify relevant unpublished trials or report trial registration, if present.


Subjects
Randomized Controlled Trials as Topic , Registries , Research Personnel , Abstracting and Indexing/classification , Authorship , Humans , Reference Standards
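
Each proportion in the abstract above is reported with a 95% confidence interval. The abstract does not say which interval method was used; the sketch below uses the normal-approximation (Wald) interval as an assumption, and for the 2004 and 2005 counts it reproduces the reported bounds after rounding. The function name is illustrative.

```python
from math import sqrt


def wald_interval(successes, total, z=1.96):
    """Normal-approximation confidence interval for a binomial proportion.

    z=1.96 corresponds to a two-sided 95% interval. Bounds are clipped
    to the [0, 1] range.
    """
    p = successes / total
    half_width = z * sqrt(p * (1 - p) / total)
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Counts taken from the abstract: 169 of 191 RCTs in 2004.
low, high = wald_interval(169, 191)
```

The Wald interval is a simple choice that behaves poorly for proportions near 0 or 1 or for small samples, where a Wilson or exact interval is usually preferred.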
10.
Braz Oral Res ; 25(3): 197-204, 2011.
Article in English | MEDLINE | ID: mdl-21670851

ABSTRACT

The aim of the present study was to analyze dental research trends in Brazil over the past nine years. All abstracts presented at the 26th Annual Meeting of the Brazilian Society for Dental Research in 2009 (n = 2648) were classified based on field of knowledge, home institution and geographic region. Data were compared with those previously published based on abstracts presented at various meetings. Between 2001 and 2006, five fields of knowledge had a greater than 10% representation among the total number of studies. These fields included restorative dentistry/dental materials (RD/DM), periodontics, endodontics, pediatric dentistry and population-based oral health. In 2009, only RD/DM maintained a greater than 10% proportion of meeting abstracts, and basic fields comprised the second position among those fields with greater representation (9.8%). The majority of research studies were performed at public institutions, and the number of abstracts per state increased significantly in 2009 (Wilcoxon test, p < 0.001). The southeastern region of Brazil submitted the greatest number of abstracts; however, other regions also demonstrated increased participation in research (11%). The percentage distribution of abstracts between states remained constant (Wilcoxon test, p = 0.255; r_s = 0.873). The results of the present study suggest a slight shift in the scientific research profile in Brazilian dentistry: fields related to professional disciplines have declined in relative research participation, while increasing interest has been observed in basic fields and new specialties.


Subjects
Abstracting and Indexing/statistics & numerical data , Bibliometrics , Dental Research/trends , Abstracting and Indexing/classification , Brazil , Congresses as Topic , Dental Research/statistics & numerical data , Humans , Publishing/statistics & numerical data , Retrospective Studies , Statistics, Nonparametric
11.
Braz. oral res ; 25(3): 197-204, May-June 2011.
Article in English | LILACS | ID: lil-590038

ABSTRACT

The aim of the present study was to analyze dental research trends in Brazil over the past nine years. All abstracts presented at the 26th Annual Meeting of the Brazilian Society for Dental Research in 2009 (n = 2648) were classified based on field of knowledge, home institution and geographic region. Data were compared with those previously published based on abstracts presented at various meetings. Between 2001 and 2006, five fields of knowledge had a greater than 10 percent representation among the total number of studies. These fields included restorative dentistry/dental materials (RD/DM), periodontics, endodontics, pediatric dentistry and population-based oral health. In 2009, only RD/DM maintained a greater than 10 percent proportion of meeting abstracts, and basic fields comprised the second position among those fields with greater representation (9.8 percent). The majority of research studies were performed at public institutions, and the number of abstracts per state increased significantly in 2009 (Wilcoxon test, p < 0.001). The southeastern region of Brazil submitted the greatest number of abstracts; however, other regions also demonstrated increased participation in research (11 percent). The percentage distribution of abstracts between states remained constant (Wilcoxon test, p = 0.255; r_s = 0.873). The results of the present study suggest a slight shift in the scientific research profile in Brazilian dentistry: fields related to professional disciplines have declined in relative research participation, while increasing interest has been observed in basic fields and new specialties.


Subjects
Humans , Abstracting and Indexing/statistics & numerical data , Bibliometrics , Dental Research/trends , Abstracting and Indexing/classification , Brazil , Congresses as Topic , Dental Research/statistics & numerical data , Publishing/statistics & numerical data , Retrospective Studies , Statistics, Nonparametric
13.
BMC Bioinformatics ; 12: 69, 2011 Mar 08.
Article in English | MEDLINE | ID: mdl-21385430

ABSTRACT

BACKGROUND: Many practical tasks in biomedicine require accessing specific types of information in scientific literature; e.g. information about the results or conclusions of the study in question. Several schemes have been developed to characterize such information in scientific journal articles. For example, a simple section-based scheme assigns individual sentences in abstracts under sections such as Objective, Methods, Results and Conclusions. Some schemes of textual information structure have proved useful for biomedical text mining (BIO-TM) tasks (e.g. automatic summarization). However, user-centered evaluation in the context of real-life tasks has been lacking. METHODS: We take three schemes of different type and granularity--those based on section names, Argumentative Zones (AZ) and Core Scientific Concepts (CoreSC)--and evaluate their usefulness for a real-life task which focuses on biomedical abstracts: Cancer Risk Assessment (CRA). We annotate a corpus of CRA abstracts according to each scheme, develop classifiers for automatic identification of the schemes in abstracts, and evaluate both the manual and automatic classifications directly as well as in the context of CRA. RESULTS: Our results show that for each scheme, the majority of categories appear in abstracts, although two of the schemes (AZ and CoreSC) were developed originally for full journal articles. All the schemes can be identified in abstracts relatively reliably using machine learning. Moreover, when cancer risk assessors are presented with scheme annotated abstracts, they find relevant information significantly faster than when presented with unannotated abstracts, even when the annotations are produced using an automatic classifier. Interestingly, in this user-based evaluation the coarse-grained scheme based on section names proved nearly as useful for CRA as the finest-grained CoreSC scheme. 
CONCLUSIONS: We have shown that existing schemes aimed at capturing information structure of scientific documents can be applied to biomedical abstracts and can be identified in them automatically with an accuracy which is high enough to benefit a real-life task in biomedicine.


Subjects
Artificial Intelligence , Data Mining , Electronic Data Processing/methods , Neoplasms , Abstracting and Indexing/classification , Computational Biology/methods , Humans , Risk Assessment
14.
Crisis ; 31(5): 281-4, 2010.
Article in English | MEDLINE | ID: mdl-21134848

ABSTRACT

BACKGROUND: Despite the growing strength of the field of suicidology, various commentators have recently noted that insufficient effort is being put into intervention research, and that this is limiting our knowledge of which suicide prevention strategies might be the most effective. AIMS: To profile the types of studies currently being undertaken by suicide prevention researchers from around the world, in order to examine the relative balance between intervention studies and other types of research. METHODS: We searched the abstract books from the 22nd, 23rd, and 24th Congresses of the International Association for Suicide Prevention and the 10th, 11th, and 12th European Symposia on Suicide and Suicidal Behavior (held between 2003 and 2008), and classified the abstracts in them according to a modified version of an existing taxonomy. RESULTS: We screened 1209 abstracts and found that only 12% described intervention studies. CONCLUSIONS: We need to redouble our efforts and make intervention studies our priority if we are to combat the global problem of suicide.


Subjects
Suicide Prevention , Abstracting and Indexing/classification , Abstracting and Indexing/statistics & numerical data , Congresses as Topic , Epidemiologic Studies , Health Services Needs and Demand , Humans , Publication Bias , Research Design/statistics & numerical data , Societies, Scientific , Suicide/statistics & numerical data
15.
Health Info Libr J ; 27(3): 235-43, 2010 Sep.
Article in English | MEDLINE | ID: mdl-20712718

ABSTRACT

BACKGROUND: Visual findings summarized in the figures and tables of academic papers are invaluable sources for biomedical researchers. Captions associated with the visual findings are often neglected while retrieving biomedical images in published academic papers. OBJECTIVES: This study is to assess caption-based topical descriptors for microscopic images of breast neoplasms, as published in academic papers retrieved through the PubMed Central database. METHOD: Human indexers as well as an automatic keyword finder called TAPoR generated the topical descriptors from collected captions. The study then compared the human-generated descriptors to machine-generated descriptors. Finally, a set of core descriptors was developed from both sets and automatically mapped into the Unified Medical Language System's (UMLS) Metathesaurus through a MetaMap Transfer engine. RESULTS: Major topical descriptors included histologic disease names, laboratory procedures, genetic functions and components. Human indexers provided more relevant descriptors than TAPoR. The UMLS Metathesaurus identified several semantic types including Indicator, Reagent, or Diagnostic Aid; Organic Chemical; Laboratory Procedure; Spatial Concept; Qualitative Concept; and Quantitative Concept. DISCUSSION: The findings suggest that caption-based descriptors can complement title or abstract-based literature indexing for figure image retrieval in articles. With respect to forming a metadata framework for online microscopic image description, the semantic types can be used as a core metadata set. In this regard, this finding can be used for standardising a microscopic image description protocol to train medical students. CONCLUSIONS: It is incumbent upon libraries and other information agencies to promote and maintain an interest in the opportunities and challenges associated with biomedical imaging.


Subjects
Abstracting and Indexing/classification , Manuscripts, Medical as Topic , Microscopy/classification , Photography/classification , Publishing , Unified Medical Language System/classification , Breast Neoplasms/pathology , Diagnostic Imaging/classification , Female , Humans , Information Storage and Retrieval/methods
16.
Article in English | MEDLINE | ID: mdl-20671313

ABSTRACT

We participated (as Team 9) in the Article Classification Task of the BioCreative II.5 Challenge: binary classification of full-text documents relevant for protein-protein interaction. We used two distinct classifiers for the online and offline challenges: 1) the lightweight Variable Trigonometric Threshold (VTT) linear classifier we successfully introduced in BioCreative 2 for binary classification of abstracts and 2) a novel Naive Bayes classifier using features from the citation network of the relevant literature. We supplemented the supplied training data with full-text documents from the MIPS database. The lightweight VTT classifier was very competitive in this new full-text scenario: it was a top-performing submission in this task, taking into account the rank product of the Area Under the interpolated precision and recall Curve, Accuracy, Balanced F-Score, and Matthews Correlation Coefficient performance measures. The novel citation network classifier for the biomedical text mining domain, while not a top-performing classifier in the challenge, performed above the central tendency of all submissions, and therefore indicates a promising new avenue to investigate further in bibliome informatics.


Subjects
Abstracting and Indexing/classification , Computational Biology/methods , Data Mining/methods , Protein Interaction Mapping/classification , Algorithms , Databases, Bibliographic , Neural Networks, Computer , Periodicals as Topic
17.
J Occup Rehabil ; 20(4): 502-11, 2010 Dec.
Article in English | MEDLINE | ID: mdl-20514511

ABSTRACT

INTRODUCTION: The consequences of accidents, injuries, and health conditions that prevent workers from engaging in employment are prevailing issues in the area of work disability. Vocational rehabilitation (VR) programs aim to facilitate the return-to-work process, but there is no universal description of functioning for patients who participate in VR. Our objective is to develop a Core Set for VR based on the International Classification of Functioning, Disability and Health (ICF). An ICF Core Set is a short list of ICF categories with alphanumeric codes relevant to a health condition or a health-related event. METHODS: The development process consists of three phases. The first is the preparatory phase, which consists of four parallel studies: (1) systematic review of the literature, (2) worldwide survey of experts, (3) cross-sectional study, and (4) focus group interview. Patients with various health conditions are to be recruited from five VR centers located in Switzerland and Germany. The second phase is a consensus conference where findings from the preparatory phase will be presented, followed by a multi-stage consensus process to determine the ICF categories that will comprise the Core Set for VR. The final phase consists of validation studies in several health conditions and settings. CONCLUSIONS: We expect the first version of the ICF Core Set for VR to be completed in 2010. The Core Set can serve as a guide in the evaluation of patients and in planning appropriate intervention within VR programs. This Core Set could also provide a standard and common language among clinicians, researchers, insurers, and policymakers in the implementation of successful VR.


Subjects
Disability Evaluation , Persons with Disabilities/classification , Health Status , International Classification of Diseases/classification , Rehabilitation, Vocational/classification , Abstracting and Indexing/classification , Activities of Daily Living , Cross-Sectional Studies , Persons with Disabilities/rehabilitation , Focus Groups , Humans , Pilot Projects , Reproducibility of Results , Severity of Illness Index , World Health Organization
18.
BMC Med Inform Decis Mak ; 10: 29, 2010 May 15.
Article in English | MEDLINE | ID: mdl-20470429

ABSTRACT

BACKGROUND: Formulating a clinical information need in terms of the four atomic parts which are Population/Problem, Intervention, Comparison and Outcome (known as PICO elements) facilitates searching for a precise answer within a large medical citation database. However, using PICO defined items in the information retrieval process requires a search engine to be able to detect and index PICO elements in the collection in order for the system to retrieve relevant documents. METHODS: In this study, we tested multiple supervised classification algorithms and their combinations for detecting PICO elements within medical abstracts. Using the structural descriptors that are embedded in some medical abstracts, we have automatically gathered large training/testing data sets for each PICO element. RESULTS: Combining multiple classifiers using a weighted linear combination of their prediction scores achieves promising results with an f-measure score of 86.3% for P, 67% for I and 56.6% for O. CONCLUSIONS: Our experiments on the identification of PICO elements showed that the task is very challenging. Nevertheless, the performance achieved by our identification method is competitive with previously published results and shows that this task can be achieved with a high accuracy for the P element but lower ones for I and O elements.


Subjects
Abstracting and Indexing/classification , Algorithms , Information Storage and Retrieval/methods , Databases, Bibliographic
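
The abstract above reports combining several classifiers through a weighted linear combination of their prediction scores and evaluating the result with an f-measure. A minimal sketch of both calculations follows. The scores, weights, and decision threshold are hypothetical, and the way the authors chose their weights is not described in the abstract.

```python
def combine_scores(scores, weights):
    """Weighted linear combination of per-classifier prediction scores."""
    if len(scores) != len(weights):
        raise ValueError("need exactly one weight per classifier score")
    return sum(w * s for w, s in zip(weights, scores))


def f_measure(precision, recall):
    """Balanced F-measure: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical scores from three classifiers for one sentence and one
# PICO element, with hypothetical weights that sum to 1.
combined = combine_scores([0.9, 0.2, 0.6], [0.5, 0.3, 0.2])
is_population_sentence = combined >= 0.5  # assumed decision threshold
```

Weighting lets a stronger classifier dominate the decision while weaker ones still contribute when the strong one is uncertain.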
20.
BMC Bioinformatics ; 9: 402, 2008 Sep 25.
Article in English | MEDLINE | ID: mdl-18817555

ABSTRACT

BACKGROUND: The rapid growth of biomedical literature presents challenges for automatic text processing, and one of the challenges is abbreviation identification. The presence of unrecognized abbreviations in text hinders indexing algorithms and adversely affects information retrieval and extraction. Automatic abbreviation definition identification can help resolve these issues. However, abbreviations and their definitions identified by an automatic process are of uncertain validity. Due to the size of databases such as MEDLINE only a small fraction of abbreviation-definition pairs can be examined manually. An automatic way to estimate the accuracy of abbreviation-definition pairs extracted from text is needed. In this paper we propose an abbreviation definition identification algorithm that employs a variety of strategies to identify the most probable abbreviation definition. In addition our algorithm produces an accuracy estimate, pseudo-precision, for each strategy without using a human-judged gold standard. The pseudo-precisions determine the order in which the algorithm applies the strategies in seeking to identify the definition of an abbreviation. RESULTS: On the Medstract corpus our algorithm produced 97% precision and 85% recall which is higher than previously reported results. We also annotated 1250 randomly selected MEDLINE records as a gold standard. On this set we achieved 96.5% precision and 83.2% recall. This compares favourably with the well known Schwartz and Hearst algorithm. CONCLUSION: We developed an algorithm for abbreviation identification that uses a variety of strategies to identify the most probable definition for an abbreviation and also produces an estimated accuracy of the result. This process is purely automatic.


Subjects
Abbreviations as Topic , Abstracting and Indexing/methods , Natural Language Processing , Pattern Recognition, Automated/methods , Abstracting and Indexing/classification , Databases, Bibliographic/statistics & numerical data , Humans
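
The precision and recall figures in the abstract above compare automatically extracted abbreviation-definition pairs with a human-annotated gold standard. The sketch below shows only that evaluation step, treating both as sets of pairs. It does not reproduce the authors' identification algorithm or their pseudo-precision estimate, and the example pairs are hypothetical.

```python
def precision_recall(extracted, gold):
    """Precision and recall of extracted pairs against a gold-standard set."""
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical (abbreviation, definition) pairs for illustration.
extracted_pairs = {
    ("SR", "systematic review"),
    ("AI", "artificial intelligence"),
    ("CI", "cancer index"),  # wrong definition, counts against precision
}
gold_pairs = {
    ("SR", "systematic review"),
    ("AI", "artificial intelligence"),
    ("CI", "confidence interval"),
    ("RCT", "randomized controlled trial"),  # never extracted, lowers recall
}
precision, recall = precision_recall(extracted_pairs, gold_pairs)
```

Matching on the whole pair means an abbreviation found with the wrong definition hurts both measures, which is the stricter reading of "correct" for this task.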
...